A Study On Distributed
Model Predictive Consensus

Tamas Keviczky* and Karl Henrik Johansson 

Abstract 

We investigate convergence properties of a proposed distributed model predictive control (DMPC) 
scheme, where agents negotiate to compute an optimal consensus point using an incremental subgradient 
method based on primal decomposition as described in [1], [2]. The objective of the distributed control 
strategy is to agree upon and achieve an optimal common output value for a group of agents in the 
presence of constraints on the agent dynamics using local predictive controllers. Stability analysis using 
a receding horizon implementation of the distributed optimal consensus scheme is performed. Conditions 
are given under which convergence can be obtained even if the negotiations do not reach full consensus. 

I. Introduction 

Engineered systems are becoming larger and increasingly complex, which creates a need to
distribute the decision-making processes that interact with or are part of these large-scale
technologies and applications. An important problem that arises among such distributed
decision-making systems (often called agents) is related to consensus-seeking and rendezvous,
which has received a high level of interest in the recent literature [3]. The consensus-seeking
and rendezvous problem consists of designing distributed control strategies such that the states
or outputs of a group of agents asymptotically converge to a common value, a consensus point,
which is agreed upon either a priori or on-the-fly using some negotiation scheme. In this paper,
we assume that a consensus point is not fixed in advance, but is rather determined by an
optimal control problem. We focus on the combination of model predictive controllers and
subgradient-based negotiation of optimal consensus (along the lines of the work in [2]), and
investigate conditions for asymptotic convergence of such distributed control schemes. We propose
an algorithm for distributed model predictive consensus, which guarantees convergence under
reasonable assumptions, provided that a sufficient number of subgradient iterations can be
performed without interruption.

We will model agents as constrained linear dynamical systems and build on the decentralized 
negotiation algorithm described in [2] to compute exactly or at least approach the optimal 
consensus point. This negotiation algorithm relies on primal decomposition of the optimal 
consensus and control problem and makes use of distributed implementation of an incremental 
subgradient method. Each agent performs individual planning of its trajectory and negotiates 
with neighbors to find an optimal or near optimal consensus point, before applying a control 
signal. 

The paper is structured as follows. Section II introduces the optimal consensus problem and
some basic notation and assumptions. The decentralized negotiation scheme of [2] is summarized
in Section III along with a decentralized receding horizon implementation of the optimal
consensus problem. Stability of the proposed decentralized negotiation and control scheme is
studied in Section IV for both converged and interrupted negotiations. Finally, Section V presents
a numerical simulation example, which illustrates the approach for an aerial refueling scenario,
and Section VI the conclusions.

II. Problem Formulation 

Consider N > 1 dynamic agents whose dynamics are described by the following discrete-time 
state equations 

$$
\begin{aligned}
x^i_{t+1} &= A^i x^i_t + B^i u^i_t, \\
y^i_t &= C^i x^i_t,
\end{aligned} \tag{1}
$$

for $i = 1, \ldots, N$, where $A^i \in \mathbb{R}^{n_i \times n_i}$, $B^i \in \mathbb{R}^{n_i \times m_i}$
and $C^i \in \mathbb{R}^{p \times n_i}$. We assume that the states and inputs of each agent are
constrained to lie in polyhedral sets

$$
x^i_t \in \mathcal{X}^i, \quad u^i_t \in \mathcal{U}^i, \quad t \ge 0. \tag{2}
$$

Definition 1: [2] The dynamic agents described by (1) reach consensus at time $T$ if

$$
\begin{aligned}
y^i_{T+k} &= \theta, & \forall k \ge 0,\ i = 1, \ldots, N, \\
u^i_{T+k} &= u^i_T, & \forall k \ge 0,\ i = 1, \ldots, N,
\end{aligned} \tag{3}
$$

where $\theta$ lies in a compact and convex set $\Theta \subset \mathbb{R}^p$.

In this paper, the consensus point $\theta$ is a vector that specifies, for example, the position and
velocity the agents shall converge to.

Our objective is to find a consensus point $\theta \in \Theta \subset \mathbb{R}^p$ and a sequence of
inputs $u^i_0, \ldots, u^i_{T-1}$, with $i = 1, \ldots, N$ and $u^i_t \in \mathcal{U}^i$ for all
$t = 0, \ldots, T-1$, such that all agent outputs are equal at time $T$:

$$
y^i_T = \theta, \quad i = 1, \ldots, N. \tag{4}
$$

We will also require each agent to be at an equilibrium at time $T$ and denote the state and
control equilibrium pair of the $i$-th agent corresponding to a given $\theta$ value by
$(x^i_e(\theta), u^i_e(\theta))$. The set of equilibria for each agent $i = 1, \ldots, N$ will thus be a
function of $\theta$ on the domain $\Theta$:

$$
\mathcal{E}^i(\theta) = \{(x^i_e(\theta), u^i_e(\theta))\}
= \{x \in \mathbb{R}^{n_i},\ u \in \mathbb{R}^{m_i} \mid x = A^i x + B^i u,\ C^i x = \theta\}. \tag{5}
$$
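We assume that the following cost function is associated with the $i$-th system:

$$
\begin{aligned}
\ell^i(x^i_t, u^i_t, \theta) ={}& (x^i_t - x^i_e(\theta))^T Q^i (x^i_t - x^i_e(\theta)) \\
&+ (u^i_t - u^i_e(\theta))^T R^i (u^i_t - u^i_e(\theta)),
\end{aligned} \tag{6}
$$

where $Q^i \in \mathbb{R}^{n_i \times n_i}$ and $R^i \in \mathbb{R}^{m_i \times m_i}$ are positive definite
symmetric matrices (i.e., we penalize deviations from the equilibrium states corresponding to the
consensus point and the use of control effort).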
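To make the equilibrium set (5) and the stage cost (6) concrete, the following minimal sketch evaluates both for a hypothetical scalar agent $x_{t+1} = a x_t + b u_t$, $y_t = c x_t$. The function names and the numerical values are illustrative assumptions and do not come from the paper.

```python
def equilibrium(a, b, c, theta):
    """Equilibrium pair (x_e, u_e) of the scalar agent x+ = a*x + b*u, y = c*x
    whose output equals theta (assumes b != 0 and c != 0)."""
    x_e = theta / c                # from c * x_e = theta
    u_e = (1.0 - a) * x_e / b      # from x_e = a * x_e + b * u_e
    return x_e, u_e


def stage_cost(x, u, theta, a, b, c, q, r):
    """Scalar version of the stage cost (6) with weights q > 0 and r > 0."""
    x_e, u_e = equilibrium(a, b, c, theta)
    return q * (x - x_e) ** 2 + r * (u - u_e) ** 2


# Hypothetical agent: a = 0.5, b = 1, c = 2, consensus value theta = 4.
x_e, u_e = equilibrium(0.5, 1.0, 2.0, 4.0)                               # (2.0, 1.0)
cost_at_eq = stage_cost(x_e, u_e, 4.0, 0.5, 1.0, 2.0, q=3.0, r=1.0)      # 0.0
cost_off_eq = stage_cost(3.0, 0.0, 4.0, 0.5, 1.0, 2.0, q=3.0, r=1.0)     # 3*1 + 1*1 = 4.0
```

The cost vanishes exactly at the equilibrium pair and grows quadratically with the deviation from it, which is the property the stability argument relies on.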

Assumption 1: Each agent dynamics $(A^i, B^i)$ is controllable and the systems
$(A^i, (Q^i)^{1/2})$ are observable.

We then formulate the following finite-time optimal control problem at time $t$ based on [2]:
Problem 1: Let $T > 0$ be fixed. Determine control vectors $u^i_{k,t}$, $k = 0, \ldots, T-1$, for all
$i = 1, \ldots, N$ and the consensus point $\theta_t$, which solve the following optimization problem:

$$
\begin{aligned}
\min_{U_t,\, \theta_t} \quad & \sum_{i=1}^{N} \sum_{k=0}^{T-1} \ell^i(x^i_{k,t}, u^i_{k,t}, \theta_t) \\
\text{subj. to} \quad & x^i_{k+1,t} = A^i x^i_{k,t} + B^i u^i_{k,t}, && \text{(7a)} \\
& x^i_{k,t} \in \mathcal{X}^i, \quad k = 1, \ldots, T, && \text{(7b)} \\
& u^i_{k,t} \in \mathcal{U}^i, \quad k = 0, \ldots, T-1, && \text{(7c)} \\
& y^i_{T,t} = \theta_t, && \text{(7d)} \\
& x^i_{T,t} = x^i_e(\theta_t), && \text{(7e)} \\
& x^i_{0,t} = x^i_t, \quad i = 1, \ldots, N, && \text{(7f)} \\
& \theta_t \in \Theta, && \text{(7g)}
\end{aligned}
$$

where $U_t = [u_{0,t}, \ldots, u_{T-1,t}] \in \mathbb{R}^{T \sum_i m_i}$ with
$u_{k,t} = [u^1_{k,t}, \ldots, u^N_{k,t}]$ denotes the part of the optimization vector containing
control inputs, and $x^i_{k,t}$ denotes the state vector of the $i$-th agent predicted at time $t + k$,
obtained by starting from the state $x^i_t$ and applying to system (1) the input sequence
$u^i_{0,t}, \ldots, u^i_{k-1,t}$. The full optimization vector consists of the vector $U_t$ defined above
and the consensus variable $\theta_t$. The subscript $t$ will be significant later in Section III, when
this problem will be solved repeatedly in a receding horizon fashion.

By implementing the solution to Problem 1, agents reach consensus at time $T$ in the sense of
Definition 1. We will make the following assumptions on the feasibility of reaching the consensus
point by all agents:

Assumption 2: The rendezvous time horizon $T$ is large enough so that all $\theta_t$ in the set
$\Theta$ are feasible, i.e., reachable consensus equilibrium points for all agents.

Assumption 3: For all $\theta_t \in \Theta$ and $i = 1, \ldots, N$, there exists a sequence
$u^i_0, \ldots, u^i_{T-1}$ in the relative interior of $\mathcal{U}^i$ such that $y^i_T = \theta_t$.

This means that it should be possible to reach $\theta_t$ without saturating the control signal (not
necessarily in an optimal way).



February 29, 2008 



Technical Report 






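The solution of Problem 1 was distributed among the agents in [2] by using primal decomposition
in combination with an incremental subgradient method [4]. First, a multiparametric
solution of the individual optimization problems was defined as

$$
\begin{aligned}
q^i(x^i_t, \theta_t) = \min_{U^i_t} \quad & \sum_{k=0}^{T-1} \ell^i(x^i_{k,t}, u^i_{k,t}, \theta_t) \\
\text{subj. to} \quad & \text{(7a)-(7g)}, \quad k = 1, \ldots, T-1.
\end{aligned} \tag{8}
$$

The optimal consensus problem in (7) can then be written as

$$
\begin{aligned}
q^*(x_t) = \min_{\theta_t} \quad & \sum_{i=1}^{N} q^i(x^i_t, \theta_t) \\
\text{subj. to} \quad & \theta_t \in \Theta.
\end{aligned} \tag{9}
$$

The set of optimal consensus points is defined as

$$
\Theta^*_t = \Big\{ \theta_t \in \Theta \;\Big|\; \sum_{i=1}^{N} q^i(x^i_t, \theta_t) = q^*(x_t) \Big\}. \tag{10}
$$

It can be established that the cost function $q^i(\cdot)$ defined in (8) is a convex function and a
subgradient $g^i$ for $q^i(\cdot)$ at $\theta_t$ is given by the Lagrange multipliers corresponding to
the terminal point constraint.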

A principal method for solving problem (9) is the subgradient method

$$
\theta_t(k+1) = \mathcal{P}_\Theta \Big[ \theta_t(k) - \alpha(k) \sum_{i=1}^{N} g^i(k) \Big], \tag{11}
$$

where $g^i(k)$ is a subgradient of $q^i$ at $\theta_t(k)$, $\alpha(k)$ is a positive stepsize, and
$\mathcal{P}_\Theta$ denotes projection on the set $\Theta \subset \mathbb{R}^p$. In the following, we will
consider the incremental subgradient method proposed in [5]. It is similar to the standard
subgradient method (11), the main difference being that at each iteration $k$, $\theta_t$ is changed
incrementally, through a sequence of $N$ steps. Each step is a subgradient iteration for a single
component function $q^i$, and there is one step per component function. Thus, an iteration can be
viewed as a cycle of $N$ subiterations. If $\theta_t(k)$ is the vector obtained after $k$ cycles, the
vector $\theta_t(k+1)$ obtained after one more cycle is

$$
\theta_t(k+1) = \vartheta^N_t(k), \tag{12}
$$

where $\vartheta^N_t(k)$ is obtained after the $N$ steps

$$
\vartheta^i_t(k) = \mathcal{P}_\Theta \big[ \vartheta^{i-1}_t(k) - \alpha(k) g^i(k) \big], \quad
g^i(k) \in \partial q^i(x^i_t, \vartheta^{i-1}_t(k)), \quad i = 1, \ldots, N, \tag{13}
$$

starting with

$$
\vartheta^0_t(k) = \theta_t(k), \tag{14}
$$

where $\partial q^i(x^i_t, \vartheta^{i-1}_t(k))$ denotes the subdifferential (set of all subgradients) of
$q^i$ at the point $\vartheta^{i-1}_t(k)$. The updates described by (13) are referred to as the
subiterations of the $k$-th cycle.
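The cycle structure of (12)-(14) can be sketched in a few lines. The example below is a hypothetical scalar illustration, not the agents' actual problem: each component function is taken as $q^i(\theta) = \tfrac{1}{2}(\theta - c_i)^2$, so that its (sub)gradient is $\theta - c_i$, the set $\Theta$ is an interval, and the diminishing stepsize $\alpha(k) = 1/(2(k+1))$ is an assumed choice.

```python
def project(theta, lo, hi):
    """Projection onto the interval [lo, hi], a scalar stand-in for P_Theta."""
    return min(max(theta, lo), hi)


def incremental_subgradient(subgrads, theta0, lo, hi, stepsize, cycles):
    """Run `cycles` cycles of (12)-(14); `subgrads[i]` returns a subgradient of q^i."""
    theta = theta0
    for k in range(cycles):
        alpha = stepsize(k)
        v = theta                                   # (14): start of the cycle
        for g in subgrads:                          # (13): one subiteration per agent
            v = project(v - alpha * g(v), lo, hi)
        theta = v                                   # (12): result of the cycle
    return theta


# Two hypothetical component functions q^i(theta) = 0.5 * (theta - c_i)**2 with c = (0, 4);
# the minimizer of their sum is theta = 2.
subgrads = [lambda th: th - 0.0, lambda th: th - 4.0]
step = lambda k: 1.0 / (2.0 * (k + 1))

theta_2 = incremental_subgradient(subgrads, 0.0, -10.0, 10.0, step, cycles=2)       # 2.125
theta_1000 = incremental_subgradient(subgrads, 0.0, -10.0, 10.0, step, cycles=1000)
```

After two cycles the iterate overshoots slightly (2.125), because each agent only sees its own component function; the diminishing stepsize removes this bias as the number of cycles grows.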

We will make the following assumptions, which will allow us to formulate well-posed problems
and characterize the number of subgradient iterations needed for convergence to a certain
tolerance.

Assumption 4 (Existence of an Optimal Solution): The optimal solution set $\Theta^*_t$ is nonempty.
Assumption 5 (Subgradient Boundedness): There exists a scalar $\beta$ such that

$$
\| g^i(k) \| \le \beta, \quad i = 1, \ldots, N, \quad k \ge 0, \tag{15}
$$

where $N$ is the number of subiterations in each cycle.

Since we assume that the set $\Theta$ is compact, Assumptions 4 and 5 are automatically satisfied.

Definition 2: We will denote the Euclidean distance from a point $z$ to the set $\Theta^*_t$ by
$\mathrm{dist}(z, \Theta^*_t)$.

Definition 3: A function $\gamma(\cdot)$, defined on nonnegative reals, is a $\mathcal{K}$ function if it is
continuous and strictly increasing with $\gamma(0) = 0$.

In the next section, we briefly describe the agreement mechanism of [2] and propose a closed- 
loop feedback control policy, which can be used in a receding horizon fashion, interleaved with 
subgradient-based negotiation of optimal consensus point updates. 

III. Decentralized Negotiation and Receding Horizon Implementation Scheme 

The optimal consensus point $\theta^*_t$ can be computed in a distributed way using the incremental
subgradient method described in (12)-(14). Reference [2] describes an algorithm in which an
estimate of the optimal consensus point is passed around between agents. Upon receiving an
estimate from its neighbor, an agent solves the optimization problem (8) to evaluate its cost
of reaching the suggested consensus point and to compute an associated subgradient (the Lagrange
multiplier of the terminal point constraint). The agent then performs a subiteration by updating the
consensus estimate according to (13) and passing the estimate to the next agent. Each agent only
computes a subgradient with respect to its own part of the objective function and not the global
objective function. The convergence of the incremental subgradient algorithm is guaranteed if
the agents can be organized into a cycle graph (for more details see [2]).

Remark 1: Besides some technical assumptions given in [1], the primal decomposition scheme
and convergence to the optimal solution of (7) using sequential local subgradient iterations
are possible due to decoupled and independently constrained agent dynamics. Furthermore, the
overall objective function is decomposable into a sum of terms that share only a single coupling
variable, $\theta_t$. Thus, fixing a $\theta_t$ value in the cost and constraints separates the optimal
control problem into local ones.

The control solution $U^*_t$ corresponding to a negotiated optimal consensus point $\theta^*_t$ provides
an open-loop control strategy for finite-time optimal consensus. However, this solution is sensitive
to model mismatch and disturbances. This suggests a receding horizon implementation, in which the
finite-time optimal consensus problem is solved repeatedly and feedback is thereby introduced.
Our goal in such an approach is to guarantee constraint fulfillment and asymptotic convergence
to a consensus point by repeatedly solving optimal consensus problems and implementing the
first sample of the control solution.

More formally, let $U^*_t = [u^*_{0,t}, \ldots, u^*_{T-1,t}]$ and $\theta^*_t$ be an optimal solution of (7)
at time $t$. Then, the first sample of $U^*_t$ is applied to the collection of agents:

$$
u_t = u^*_{0,t}. \tag{16}
$$

The optimization (7) is repeated at time $t+1$, based on the new state $x_{t+1}$.
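The receding horizon mechanism in (16) can be illustrated with a deliberately simplified sketch. It assumes a hypothetical scalar integrator $x_{t+1} = x_t + u_t$, a consensus point held fixed, and a planner that minimizes only $\sum_k u_k^2$ subject to reaching the target in $T$ steps (for which the equal split $u_k = (\theta - x)/T$ is optimal). This planner is a stand-in for the constrained problem (7), not the cost (6) used in the paper.

```python
def min_energy_plan(x, theta, T):
    """Input sequence minimizing sum(u_k**2) for x+ = x + u subject to x_T = theta."""
    return [(theta - x) / T] * T


def receding_horizon(x0, theta, T, steps):
    """Re-plan at every sampling time and apply only the first planned input, as in (16)."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        plan = min_energy_plan(x, theta, T)   # finite-time plan from the current state
        u = plan[0]                           # first sample of the plan
        x = x + u                             # hypothetical integrator dynamics
        trajectory.append(x)
    return trajectory


# Hypothetical numbers: start at 0, target 1, horizon T = 4, three closed-loop steps.
traj = receding_horizon(0.0, 1.0, T=4, steps=3)   # [0.0, 0.25, 0.4375, 0.578125]
```

In this toy setting the distance to the target shrinks by the factor $1 - 1/T$ at every step, so the closed loop approaches the target asymptotically rather than reaching it at a fixed time $T$.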

Remark 2: Stability of such a combination of DMPC and incremental subgradient methods is
not a trivial question, especially since the terminal constraint value in the receding horizon scheme
based on (7) is an optimization variable as well. The main point of the following investigation is
to rule out a scenario where repeatedly solving and implementing the first step of a finite-time
optimal control solution with changing terminal constraint value eventually results in divergence
or lack of stability. Compared to the work in [1], this question arises because we are no longer
considering only the open-loop implementation of a control sequence that terminates with the
value $u_e(\theta^*_t)$ at time $T$, but one that is updated every time step (along with $\theta^*_t$),
based on new measurements in a receding horizon fashion.

IV. Stability Analysis

In this section we will be primarily interested in establishing conditions for asymptotic
convergence of the combined DMPC and consensus algorithm to the set of equilibria defined as

$$
\mathcal{E} = \big( \mathcal{E}^1(\theta), \ldots, \mathcal{E}^N(\theta), \theta \big), \quad \theta \in \Theta. \tag{17}
$$

A. Fully Converged Negotiations 

For now, we will assume that in each implementation cycle (i.e., at sampling time $t$), the
distributed negotiations on the optimal consensus value $\theta^*_t$ have converged before the
implementation of the corresponding control actions. In other words, the optimal solution of
problem (7) is attained by every agent in each time step by means of the distributed consensus
algorithm of [1]. This allows us to consider the overall system as a whole for stability analysis,
using the following aggregate dynamics

$$
\begin{aligned}
x_{t+1} &= A x_t + B u_t, \\
y_t &= C x_t,
\end{aligned} \tag{18}
$$

where $A = \mathrm{diag}(A^i) \in \mathbb{R}^{\sum_i n_i \times \sum_i n_i}$,
$B = \mathrm{diag}(B^i) \in \mathbb{R}^{\sum_i n_i \times \sum_i m_i}$ and
$C = \mathrm{diag}(C^i) \in \mathbb{R}^{pN \times \sum_i n_i}$. The states and inputs of the overall
system are constrained by

$$
x_t \in \mathcal{X} = \prod_i \mathcal{X}^i, \quad u_t \in \mathcal{U} = \prod_i \mathcal{U}^i, \quad t \ge 0, \tag{19}
$$

where the symbol $\prod$ denotes the standard Cartesian product of sets. Note that according to (4),
consensus for the aggregate system dynamics means $y_T = C x_T = \mathbf{1}_N \otimes \theta$.

Stability analysis in this case pertains to the study of the receding horizon control scheme
given in (7) and (16) with a terminal point constraint tied to one of its optimization variables,
$\theta_t$. This will be performed next.
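As a small illustration of the aggregate notation, the sketch below stacks hypothetical output matrices into a block-diagonal $C$ and checks the consensus condition $Cx = \mathbf{1}_N \otimes \theta$ for two made-up agents. The helper names and numbers are assumptions for illustration only.

```python
def block_diag(blocks):
    """Block-diagonal matrix (list of rows) built from a list of matrices."""
    n_rows = sum(len(b) for b in blocks)
    n_cols = sum(len(b[0]) for b in blocks)
    out = [[0.0] * n_cols for _ in range(n_rows)]
    r0 = c0 = 0
    for b in blocks:
        for r, row in enumerate(b):
            for c, val in enumerate(row):
                out[r0 + r][c0 + c] = val
        r0 += len(b)
        c0 += len(b[0])
    return out


def matvec(M, v):
    """Matrix-vector product for list-based matrices."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def stacked_consensus(n_agents, theta):
    """The vector 1_N (Kronecker product) theta: theta repeated once per agent."""
    return [th for _ in range(n_agents) for th in theta]


# Two hypothetical agents with scalar outputs: C^1 = [1 0] (two states), C^2 = [2] (one state).
C = block_diag([[[1.0, 0.0]], [[2.0]]])
x = [3.0, 7.0, 1.5]                      # stacked state [x^1; x^2]
y = matvec(C, x)                         # [3.0, 3.0]
in_consensus = (y == stacked_consensus(2, [3.0]))   # True
```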

The set of states at time $k$ feasible for Problem 1 is given by

$$
\begin{aligned}
\mathcal{X}_k &= \{ x \mid \exists u \in \mathcal{U} \text{ such that } Ax + Bu \in \mathcal{X}_{k+1} \} \cap \mathcal{X}, \quad \text{with} \\
\mathcal{X}_{T-1} &= \{ x \mid \exists u \in \mathcal{U} \text{ and } \theta \in \Theta \text{ such that }
x = Ax + Bu \text{ and } C(Ax + Bu) = \mathbf{1}_N \otimes \theta \} \cap \mathcal{X}.
\end{aligned} \tag{20}
$$

Denote with

$$
c(x_t) = u^*_{0,t}, \tag{21}
$$

the control law obtained by applying the receding horizon control policy in (7) and (16) with cost
function (6) for each agent, when the current state is $x_t = [x^1_t, \ldots, x^N_t]$. Consider the
aggregate dynamical model (18) and denote with

$$
x_{t+1} = A x_t + B c(x_t), \tag{22}
$$

the closed-loop dynamics of the entire system. In the following theorem, we state sufficient
conditions for the asymptotic convergence of the closed-loop system to the set of equilibria
$\mathcal{E}$.

Theorem 1: Assume that

(A0) $Q^i \succ 0$, $R^i \succ 0$ for all $i = 1, \ldots, N$.

(A1) For all $\theta_t \in \Theta$ there exists a unique equilibrium $x^i_e(\theta_t) \in \mathcal{X}^i$,
$u^i_e(\theta_t) \in \mathcal{U}^i$ for all $i = 1, \ldots, N$ such that
$x^i_e = A^i x^i_e + B^i u^i_e$ and $C^i x^i_e = \theta_t$.

(A2) The state and input constraint sets $\mathcal{X}^i$ and $\mathcal{U}^i$ contain all $x^i_e$ and
$u^i_e$ equilibrium pairs in their interior, respectively, for all $i = 1, \ldots, N$.

Then, the closed-loop system (22) asymptotically converges to the set of equilibria $\mathcal{E}$ with
domain of attraction $\mathcal{X}_0$.

Proof: We introduce the following notation:

$$
J^i(x^i_t, U^i_t, \theta_t) = \sum_{k=0}^{T-1} \ell^i(x^i_{k,t}, u^i_{k,t}, \theta_t) \tag{23}
$$

and

$$
J(x_t, U_t, \theta_t) = \sum_{i=1}^{N} J^i(x^i_t, U^i_t, \theta_t). \tag{24}
$$

The optimal value function obtained from solving problem (7) at time $t$ will thus be denoted as
$J^*(x_t, U^*_t, \theta^*_t)$.

We will show first that the optimal value function $J^*(x_t, U^*_t, \theta^*_t)$ decreases along the
closed-loop trajectories of the overall system at each time step, i.e.,
$J^*(x_{t+1}, U^*_{t+1}, \theta^*_{t+1}) < J^*(x_t, U^*_t, \theta^*_t)$, if the assumptions of the
theorem hold.

Let the initial state at time $t$ be $x_t = x_{0,t} \in \mathcal{X}_0$ and let
$U^*_t = [u^*_{0,t}, \ldots, u^*_{T-1,t}]$ and $\theta^*_t$ be the optimizers of problem (7). Denote with
$x^*_t = [x_{0,t}, x^*_{1,t}, \ldots, x^*_{T,t}]$ the corresponding optimal state trajectory, with
$\mathbf{1}_N \otimes \theta^*_t = C x^*_{T,t}$. Let $x_{t+1} = x^*_{1,t} = A x_{0,t} + B u^*_{0,t}$ and
consider problem (7) for time $t+1$. We will construct an upper bound for
$J^*(x_{t+1}, U^*_{t+1}, \theta^*_{t+1})$. Consider the sequence
$\tilde{U}_{t+1} = [u^*_{1,t}, \ldots, u^*_{T-1,t}, v]$ and the corresponding state trajectory resulting
from the initial state $x_{t+1}$, $\tilde{x}_{t+1} = [x^*_{1,t}, \ldots, x^*_{T,t}, A x^*_{T,t} + B v]$.
The input $\tilde{U}_{t+1}$ will be feasible for the problem at $t+1$ if and only if $v \in \mathcal{U}$
keeps $C(A x^*_{T,t} + B v)$ equal to some $\mathbf{1}_N \otimes \theta$ with $\theta \in \Theta$ at step $T$
of the prediction, i.e., $C(A x^*_{T,t} + B v) = \mathbf{1}_N \otimes \theta$. Such a $v$ exists by
hypothesis (A1). Since $x^*_{T,t}$ is an equilibrium of the system, this also allows us to choose a
feasible $v$ for which in fact $C(A x^*_{T,t} + B v) = \mathbf{1}_N \otimes \theta^*_t$. This is
accomplished by noticing that $x^*_{T,t} = x_e(\theta^*_t)$ and selecting

$$
v = u_e(\theta^*_t). \tag{25}
$$

$J(x_{t+1}, \tilde{U}_{t+1}, \theta^*_t)$ will be an upper bound for the optimal
$J^*(x_{t+1}, U^*_{t+1}, \theta^*_{t+1})$. Since the trajectories generated by $U^*_t$ and
$\tilde{U}_{t+1}$ overlap (except for the first and last sampling intervals), it is immediate
to show that

$$
\begin{aligned}
J^*(x_{t+1}, U^*_{t+1}, \theta^*_{t+1})
\le{}& J(x_{t+1}, \tilde{U}_{t+1}, \theta^*_t) \\
={}& J^*(x_t, U^*_t, \theta^*_t) - (x_{0,t} - x_e(\theta^*_t))^T Q (x_{0,t} - x_e(\theta^*_t)) \\
& - (u^*_{0,t} - u_e(\theta^*_t))^T R (u^*_{0,t} - u_e(\theta^*_t)) \\
& + ((A x^*_{T,t} + B v) - x_e(\theta^*_t))^T Q ((A x^*_{T,t} + B v) - x_e(\theta^*_t)) \\
& + (v - u_e(\theta^*_t))^T R (v - u_e(\theta^*_t)),
\end{aligned} \tag{26}
$$
where $Q = \mathrm{diag}(Q^i) \in \mathbb{R}^{\sum_i n_i \times \sum_i n_i}$ and
$R = \mathrm{diag}(R^i) \in \mathbb{R}^{\sum_i m_i \times \sum_i m_i}$. Choosing the particular $v$
value given in (25) leads to $A x^*_{T,t} + B v - x_e(\theta^*_t) = 0$, so equation (26) becomes

$$
\begin{aligned}
J^*(x_{t+1}, U^*_{t+1}, \theta^*_{t+1}) - J^*(x_t, U^*_t, \theta^*_t)
\le{}& -(x_{0,t} - x_e(\theta^*_t))^T Q (x_{0,t} - x_e(\theta^*_t)) \\
& - (u^*_{0,t} - u_e(\theta^*_t))^T R (u^*_{0,t} - u_e(\theta^*_t)) \\
\le{}& -\gamma\big( \| (x_t - x_e(\theta), u_t - u_e(\theta)) \| \big), \quad \forall x_t \in \mathcal{X}_0,
\end{aligned} \tag{27}
$$

where $\gamma$ is a class $\mathcal{K}$ function. This inequality along with hypothesis (A0) on the
matrices $Q$ and $R$ ensures that $J^*(x_t, U^*_t, \theta^*_t)$ decreases along the state trajectories
of the closed-loop system (22) for any $x_t \in \mathcal{X}_0$. Since
$J^*(x_t, U^*_t, \theta^*_t) \ge 0$ for all $x_t, U^*_t, \theta^*_t$, it follows that
$J^*(x_t, U^*_t, \theta^*_t) \to J^*_\infty$ as $t \to \infty$, where $J^*_\infty$ is a nonnegative
constant. We conclude that
$J^*(x_{t+1}, U^*_{t+1}, \theta^*_{t+1}) - J^*(x_t, U^*_t, \theta^*_t) \to 0$ as $t \to \infty$, and
this implies that $\gamma(\| (x_t - x_e(\theta), u_t - u_e(\theta)) \|) \to 0$. From $\gamma(\cdot)$ being
a $\mathcal{K}$ function, it follows that $x_t - x_e(\theta) \to 0$ and $u_t - u_e(\theta) \to 0$ as
$t \to \infty$. □
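As a hedged illustration of the shifted-sequence argument behind (26) and (27), consider a hypothetical single scalar agent that is not part of the paper's example: $x_{t+1} = x_t + u_t$, $y_t = x_t$, with $\Theta = \{0\}$, $Q = R = 1$ and $T = 2$, so that $x_e(0) = 0$ and $u_e(0) = 0$. Starting from $x_{0,t} = 2$, the numbers work out as follows.

```latex
\begin{aligned}
&\text{Terminal constraint: } u_1 = -(2 + u_0), \qquad
J = \big(4 + u_0^2\big) + \big((2 + u_0)^2 + u_1^2\big) = 4 + u_0^2 + 2(2 + u_0)^2, \\
&\frac{dJ}{du_0} = 6u_0 + 8 = 0 \;\Rightarrow\;
u^*_{0,t} = -\tfrac{4}{3}, \quad x^*_{1,t} = \tfrac{2}{3}, \quad u^*_{1,t} = -\tfrac{2}{3}, \quad
J^*(x_t) = \tfrac{20}{3}, \\
&\tilde{U}_{t+1} = [\,u^*_{1,t},\ v\,] = \big[-\tfrac{2}{3},\ 0\big]: \quad
J(x_{t+1}, \tilde{U}_{t+1}, 0) = \tfrac{4}{9} + \tfrac{4}{9} + 0 = \tfrac{8}{9}
= \tfrac{20}{3} - \tfrac{52}{9}
= J^*(x_t) - \big(x_{0,t}^2 + (u^*_{0,t})^2\big), \\
&J^*(x_{t+1}) = \tfrac{5}{3}\,\big(x^*_{1,t}\big)^2 = \tfrac{20}{27} \;\le\; \tfrac{8}{9}.
\end{aligned}
```

The last line uses the fact that, in this toy problem, the optimal cost scales quadratically with the initial state, $J^*(x) = \tfrac{5}{3} x^2$. The shifted sequence is feasible but not optimal at $t+1$, so it only provides the upper bound needed for the decrease condition.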

B. Interrupted Negotiations 

In case the distributed negotiation process is interrupted (e.g., due to execution time constraints)
or otherwise allowed to run only for a finite number of iterations before the control inputs are
implemented, the $\theta^i_t$ values do not converge to a common optimal value $\theta^*_t$. This means
that individual agents will issue control commands that will guide them to possibly close but
different terminal consensus points. In such a situation, we desire to find conditions under which
repeated negotiation and implementation of intermediate consensus results will still allow
asymptotic convergence to a common consensus point for each agent.

We propose an algorithm that fulfills the above objective if the subgradient iterations in
subsequent time steps approach the optimal consensus point with increasing accuracy and, at the
same time, the local MPC solutions satisfy an improvement property along the closed-loop
evolution of the agents' dynamics. The first requirement ensures that the mismatch between
different interrupted $\theta^i_t$ values diminishes as $t \to \infty$. The second requirement is
analogous to the standard suboptimal MPC scheme in [6], where it is established that feasibility
of such an improvement constraint implies stability of the receding horizon control scheme.

In the following, we will denote the final (i.e., implemented) consensus point reached
by agent $i$ in the subgradient negotiation process of time instance $t$ by $\theta^i_t$. This
intermediate consensus point is not optimal for the global optimization problem (9), but due to
Assumptions 2-3 it is certainly feasible for the following local problem:

$$
\begin{aligned}
\min \quad & J^i(x^i_t, U^i_t, \theta^i_t) && \text{(28a)} \\
\text{subj. to} \quad & \theta^i_t \in \Theta. && \text{(28b)}
\end{aligned}
$$

Distinguishing between the local $\theta^i_t$ variables allows the original global optimization
problem (9) to be restated as

$$
\begin{aligned}
\min_{\theta_t} \quad & \sum_{i=1}^{N} q^i(x^i_t, \theta^i_t) \\
\text{subj. to} \quad & \theta^i_t = \theta_t \in \Theta.
\end{aligned} \tag{29}
$$

As opposed to the fully converged subgradient scheme in Section IV-A, the $\theta^i_t$ variables do
not converge to the globally optimal one, thus we cannot rely on optimality of the MPC scheme to
prove global convergence. Instead, an improvement property as shown in [6], which is required
for asymptotic convergence to the set of equilibria, will be formulated as

$$
\sum_{i=1}^{N} \Big( J^i(x^i_{t+1}, U^i_{t+1}, \theta^i_{t+1}) - J^i(x^i_t, U^i_t, \theta^i_t) \Big)
\le -\gamma\big( \| (x_t - x_e(\theta), u_t - u_e(\theta)) \| \big), \tag{30}
$$

where $\gamma$ is a class $\mathcal{K}$ function. A feasible sequence for such a constraint always exists
based on Assumptions 2-3 and the earlier developments in Section IV-A.

The value function improvement property in (30) is not sufficient for convergence to the global
optimum, since the common terminal point constraint is missing and the local $\theta^{i*}_t$ values
are in general different. Thus, if the agents' initial states are close to their local
$x^i_e(\theta^{i*}_t)$ equilibria, which are significantly different from each other, then any
subgradient-based or other adjustment of the local terminal point constraints towards the globally
optimal $\theta^*_t$ value would necessarily result in both local and global cost increase.

This suggests that an additional requirement besides the cost improvement property is needed,
which ensures that the $\theta^i_t$ values will also converge over time to a common $\theta_t$. This can
be accomplished by requiring that in each iteration the subgradient-based negotiation scheme is
executed at least until

$$
\| \theta^i_t - \theta^*_t \| \le \epsilon_t, \quad \forall i = 1, \ldots, N, \tag{31}
$$
where the approximation bound is updated for instance according to

$$
\epsilon_t \le \frac{\epsilon}{t}, \quad \epsilon > 0. \tag{32}
$$

In order to have some information about the required number of incremental subgradient
iterations that guarantee fulfillment of constraints (30) and (31), we will make use of the
following result from [7]. It can be shown that under a strong convexity type assumption, the
incremental subgradient method defined earlier in (12)-(14) with an appropriately chosen stepsize
$\alpha(k)$ has a sublinear convergence rate:

Proposition 1: [7] Let Assumptions 4 and 5 hold, and assume that there exists a positive
scalar $\mu$ such that

$$
q(x_t, \theta_t) - q^*(x_t) \ge \mu \big( \mathrm{dist}(\theta_t, \Theta^*_t) \big)^2, \quad \forall \theta_t \in \Theta. \tag{33}
$$

Then for the sequence $\{\theta_t(k)\}$ generated by the incremental subgradient method with the
stepsize $\alpha(k) = \frac{1}{2\mu(k+1)}$, we have

$$
\big( \mathrm{dist}(\theta_t(k+1), \Theta^*_t) \big)^2 \le \frac{1 + \ln(k+1)}{k+1} \, \frac{N^2 \beta^2}{4 \mu^2}. \tag{34}
$$
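To get a feel for how many cycles such a rate implies, the sketch below evaluates a bound of the form $\frac{1+\ln(k+1)}{k+1}\,\frac{N^2\beta^2}{4\mu^2}$, which is the expression that appears in the $f_{SG}$ test of Algorithm 1 and is assumed here to be the right-hand side of (34). The function names and the numerical values of $N$, $\beta$ and $\mu$ are illustrative assumptions.

```python
import math


def sg_bound(k, n_agents, beta, mu):
    """Assumed bound on the squared distance to the optimal set after cycle k (k >= 0)."""
    return (1.0 + math.log(k + 1)) / (k + 1) * (n_agents * beta) ** 2 / (4.0 * mu ** 2)


def cycles_needed(tol_sq, n_agents, beta, mu, k_max=10 ** 6):
    """Smallest cycle index k whose bound is at most tol_sq, or None if k_max is exceeded."""
    for k in range(k_max):
        if sg_bound(k, n_agents, beta, mu) <= tol_sq:
            return k
    return None


# Hypothetical constants: N = 3 agents, subgradient bound beta = 2, mu = 1.
b0 = sg_bound(0, 3, 2.0, 1.0)            # 9.0
k_loose = cycles_needed(8.0, 3, 2.0, 1.0)    # 1
k_tight = cycles_needed(0.1, 3, 2.0, 1.0)    # several hundred cycles
```

Because the bound decays only like $\ln(k)/k$, tightening the tolerance by a factor of ten requires roughly ten times more cycles, which is consistent with the growing iteration counts discussed for decreasing tolerances.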

In the following, we describe a scheme that allows the two conditions (30) and (31) to be
tested based on the cyclic communication scheme underlying the subgradient-based negotiation.

In Algorithm 1, agents perform cyclic iterations of the subgradient (SG) method (12)-(14).
They execute at least the number of iterations dictated by the optimal $\theta^*_t$ approximation
requirement in (31). Satisfaction of the test (31) is signaled by a flag $f_{SG}$. If needed, agents
continue with subgradient iterations until the global cost improvement property in (30) is
satisfied. This is signaled by the flag $f_{DMPC}$.

In order to accomplish this, agents pass along, besides their current subiterate $\vartheta^i_t(k)$ of
the consensus point in iteration $k$ at sampling time $t$, the two binary variables (flags)
$f_{DMPC}$ and $f_{SG}$ corresponding to tests (30) and (31), and two vectors of dimension $N$:
$J_{curr}$ and $J_{prev}$. These vectors contain the individual cost values associated with the
current and previous sampling time, respectively. $J_{prev}$ has values corresponding to the cost
of using the final consensus points $\theta^i_{t-1}$ for implementation during the previous sampling
time $t-1$. The current cost $J_{curr}$ gets filled up cyclically using the most recent subiterate
for each agent.

When an agent computes its own consensus point subiterate, it calculates the corresponding
local cost value and checks the sum of previous and current cost values for each agent to decide
whether the improvement property (30) is satisfied. If it is, then it sets the flag $f_{DMPC}$, which
indicates that the improvement property (30) is fulfilled and every other agent should enter
an implementation phase, provided that condition (31) is also satisfied. The message reaches all
other agents eventually as they pass along this information in a cyclic pattern. If property (30)
is not satisfied, then it puts its current cost value entry in the vector $J_{curr}$ and passes it on
to the next agent.
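The information passed around the cycle can be pictured as a small message object. The sketch below is one possible reading of the description above, with hypothetical names (`Token`, `agent_step`, `ready_to_implement`). It simplifies the improvement test (30) to a strict decrease of the summed costs, which is an assumption rather than the paper's exact condition.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Token:
    """Message passed from agent to agent along the cycle."""
    theta: float                       # current subiterate of the consensus point
    f_dmpc: bool = False               # improvement test, cf. (30)
    f_sg: bool = False                 # accuracy test, cf. (31)
    j_curr: List[float] = field(default_factory=list)   # costs at the current sampling time
    j_prev: List[float] = field(default_factory=list)   # costs at the previous sampling time


def agent_step(token, i, new_theta, local_cost):
    """Agent i stores its subiterate and local cost, then re-evaluates the improvement flag."""
    token.theta = new_theta
    token.j_curr[i] = local_cost
    token.f_dmpc = sum(token.j_curr) - sum(token.j_prev) < 0.0
    return token


def ready_to_implement(token):
    """Agents switch to the implementation phase only when both flags are set."""
    return token.f_dmpc and token.f_sg


# Hypothetical costs for three agents.
token = Token(theta=0.0, j_curr=[5.0, 5.0, 5.0], j_prev=[5.0, 5.0, 5.0])
agent_step(token, 0, new_theta=0.4, local_cost=3.0)
flag_after_first = token.f_dmpc      # True: 13 - 15 < 0
agent_step(token, 1, new_theta=0.6, local_cost=8.0)
flag_after_second = token.f_dmpc     # False: 16 - 15 > 0
```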

Theorem 2: Under the assumptions of Section II, Algorithm 1 converges asymptotically to
the set of equilibria $\mathcal{E}$.

Proof: The main idea of the proof follows along the lines of Theorem 1, except for two
crucial points. A feasible sequence for the improvement constraint (30) always exists based on
Assumptions 2-3 and the developments in Section IV-A. This improvement property guarantees
that even with interrupted negotiations, the distributed MPC problem converges asymptotically to
some set of different terminal points (since the $\theta^i_t$ are different in this case). However,
these terminal points are guaranteed to form a single consensus point, attained asymptotically by
the repeated application of the iterative subgradient method, due to (31) and the compactness of
$\Theta$. □

Remark 3: Although Algorithm 1 guarantees global convergence, it requires an increasing
number of subgradient iterations in subsequent time steps in order to approach the optimal value
with a decreasing tolerance. The requirements (30) and (31) are only sufficient conditions and
thus might be somewhat conservative. Decreasing the initial stepsize of the subgradient iterations
may solve this problem. The increasing number of subgradient iterations can also be alleviated
in practice in the following way: once $\epsilon_t$ gets small enough or another condition indicating
closeness to the global consensus point is satisfied, the consensus point $\theta$ can be fixed for all
agents and the scheme can proceed with a pure decentralized MPC scheme. This would ensure
convergence due to the result shown in Section IV-A.

V. Numerical example of an aerial refueling scenario 

This numerical example considers a simplified aerial refueling scenario with three aircraft,
which illustrates the importance of negotiating an optimal consensus point and the difference
from standard rendezvous problems. The scheme involves the linearized longitudinal dynamics
of three aircraft representing the tanker and two smaller aircraft to be serviced, respectively.
The simplified objective is to control all three aircraft from their initial status to the same
altitude and airspeed, which will allow the refueling process to be initiated. The consensus
variable $\theta_t = [\Delta h \ \Delta V]^T$ will thus be a vector of dimension two, with entries of
altitude and true airspeed deviations from the trim values, respectively. The optimal rendezvous
altitude and airspeed will be determined using the distributed (cyclic) negotiation process based
on local subgradient iterations. Once the corresponding control actions are implemented, the
negotiations start again in the subsequent sampling time from the new initial states. This receding
horizon procedure controls all three aircraft to a common altitude and airspeed. The optimal choice
of the rendezvous parameters depends on the dynamics and constraints of the individual aircraft,
along with the weighting matrices of the local cost functions, which penalize deviations in
altitude, velocity and the use of control actions.

The tanker is represented by a Boeing 747-100/200 aircraft model based on [8]. The two
serviced vehicles are modeled as F-16 aircraft from [9], [10]. The models were all discretized
with a sampling time of 0.05 s. The cost function of the tanker penalizes altitude changes heavily,
and the cost functions associated with the two fighters are based on their health or fuel level.

The linearized longitudinal models of all aircraft involved correspond to a straight and level
flight condition at 4000 m altitude and 184 m/s true airspeed. The control inputs of the linearized
Boeing 747-100/200 aircraft represent deviations from the trim values of thrust and elevator
deflection, denoted by $u^{B747} = [\delta_{th} \ \delta_e]^T$. The F-16 aircraft models are equipped
with an inner-loop pitch rate controller, so the control inputs of their linearized models are
deviations from trim throttle and the desired pitch rate command,
$u^{F16} = [\delta_{th} \ \delta_{q_{cmd}}]^T$. The constraints on the aircraft inputs are defined as
follows:

$$
u^{B747} \in \begin{bmatrix} \pm 150000 \ \text{lb} \\ \pm 10 \ \text{deg} \end{bmatrix}, \qquad
u^{F16} \in \begin{bmatrix} \pm 5000 \ \text{lb} \\ \pm 100 \ \text{deg/s} \end{bmatrix}.
$$

The initial altitudes of the two F-16s are chosen as ±30.48 m around the trim altitude. The
tanker's initial altitude with respect to the trim value is −10 m. All aircraft are initialized as
flying straight and level with the same trim velocity.
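For reference, the plain average of the three initial altitude deviations stated above (two F-16s at ±30.48 m and the tanker at −10 m, all relative to trim) works out as follows.

```latex
\frac{1}{3}\big( 30.48 + (-30.48) + (-10) \big)\ \text{m} = -\frac{10}{3}\ \text{m} \approx -3.33\ \text{m}
```

This is the value an unweighted averaging scheme would select, independently of the aircraft dynamics or the cost weights.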

If we simply considered the average value of the initial altitudes as a rendezvous point,
then the aircraft should approach approximately −3 m. Due to the different dynamics and cost
functions of the aircraft representing fuel cost and vehicle health priorities, this average value is
not optimal. We consider weighting matrices in the local cost functions that are intended to mimic
a situation where the F-16 indexed as #1 penalizes altitude changes and pitch control more heavily
due to a restriction posed by limited fuel supply or elevator control authority.

Simulation results of the centralized solution to the optimal finite-time rendezvous problem
with a time horizon of T = 100 steps (5 seconds) are shown in Figure 1. The optimal rendezvous
point is determined to be 21.2340 m away from the trim altitude, and the common terminal
velocity at the straight and level flight equilibrium is −1.1933 m/s away from the initial trim
velocity. Notice that the rendezvous altitude is quite close to F-16 #1, as an indication of its
limited maneuvering capabilities represented in its local cost function.

The problem can be solved in a distributed fashion using the incremental subgradient method
described in [2] and implemented through a cyclic sequential update. Since the optimal rendezvous
point and the corresponding control solutions converge to the centralized one asymptotically,
only the convergence of $\theta^i_t(k)$ is shown in Figure 3.

In order to have a receding horizon implementation of the incremental subgradient method
proposed in Algorithm 1, the maximum number of major subgradient iterations was limited to
15 before applying the local control solutions at each new MPC update. The limited number
of subgradient iterations leads to a mismatch between local versions of the common rendezvous
point $\theta^i_t$. Still, the MPC implementation of the finite-time optimal rendezvous problem with
local rendezvous point mismatch stabilizes the system and provides acceptable performance in
addition to providing feedback action. The resulting trajectories are shown in Figure 2 until
sampling time t = 100. The convergence of $\theta^{F16,1}_t(k)$ in each MPC update period is shown in
Figure 4, where one horizontal axis represents the subgradient iterations $k$ (limited at 15) and the
other horizontal axis corresponds to the MPC sampling time index $t$. Notice that besides the
improvement made in each MPC sampling period, there is also a trend indicating that the local
$\theta^{F16,1}_t(k)$ value starts from a point closer to $\theta^*_t(k)$ and approaches it more and more
closely in subsequent MPC updates.
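The effect of a capped number of negotiation cycles can be reproduced qualitatively in a toy setting. The sketch below is not the aircraft simulation: it assumes hypothetical scalar integrator agents $x^i_{t+1} = x^i_t + u^i_t$ with local costs $q^i(x^i, \theta) = w_i(\theta - x^i)^2 / T$ (the minimum of $\sum_k w_i u_k^2$ subject to reaching $\theta$ in $T$ steps), the stepsize $\alpha(k) = 1/(2\mu(k+1))$ with $\mu = \sum_i w_i / T$, and made-up weights. Each agent keeps the last estimate it computed and steers towards it.

```python
def simulate(x0, w, T=10, cycles=3, steps=2000, lo=-100.0, hi=100.0):
    """Receding horizon loop with at most `cycles` incremental subgradient cycles per step."""
    n = len(x0)
    x = list(x0)
    mu = sum(w) / T                  # strong convexity constant of sum_i q^i in theta
    theta = x[0]                     # initial estimate: the first agent's own state
    for _ in range(steps):
        local = [theta] * n          # consensus estimate last seen by each agent
        for k in range(cycles):
            alpha = 1.0 / (2.0 * mu * (k + 1))
            v = theta
            for i in range(n):
                g = 2.0 * w[i] * (v - x[i]) / T          # gradient of q^i at v
                v = min(max(v - alpha * g, lo), hi)      # projected subiteration
                local[i] = v
            theta = v                # warm start for the next cycle / sampling time
        # Each agent applies the first input of its own minimum-energy plan.
        x = [xi + (li - xi) / T for xi, li in zip(x, local)]
    return x, theta


# Initial deviations borrowed from the example; the weights are hypothetical.
x_init = [30.48, -30.48, -10.0]
x_final, theta_final = simulate(x_init, w=[4.0, 1.0, 1.0])
spread = max(x_final) - min(x_final)
```

In this toy model every update is a convex combination of the current states and the previous estimate, so the states remain within the range of the initial values and their spread shrinks over time even though the agents never share an identical estimate within a sampling period. The common value they approach depends on the weights and need not equal the plain average.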

VI. Conclusions 

We have introduced a distributed model predictive control (DMPC) framework, where the
control objective is to agree upon and achieve an optimal consensus point for constrained dynamic
agents. The negotiation scheme makes use of the cyclic incremental subgradient algorithm
described in [2]. Convergence properties of the combined DMPC / incremental subgradient
approach were analyzed and a sufficient minimum number of subgradient iterations was
established. An algorithm was proposed that ensures convergence of the decentralized scheme.

Fig. 2. MPC implementation of the finite-time optimal rendezvous problem with subgradient iterations limited at 15.

Besides the aerial refueling numerical example shown in Section V, the approach proposed in
this paper can be used in other distributed "synchronization" problems as well, where agents with
constrained dynamics have to agree upon and achieve simultaneously an "optimal" consensus
value, which is not known a priori. Our current work considers schemes that relax the cyclic,
sequential communication requirement and rely on parallel, localized iterations.

Algorithm 1: Cyclic incremental DMPC algorithm

Initialize $\beta$, $\mu$, $\epsilon$, $\theta^i$;
$f_{DMPC}, f_{SG} \leftarrow$ false;
$k, t \leftarrow 0$;
$J_{curr} \in \mathbb{R}^N \leftarrow 0$;
$J_{prev} \in \mathbb{R}^N \leftarrow M$;    /* M is large number */
loop
    Measure states $x^i_t$;
    repeat
        $\alpha(k) \leftarrow \frac{1}{2\mu(k+1)}$;
        $\vartheta^0_t(k) \leftarrow \theta_t(k)$;
        for $i = 1$ to $N$ do
            if $f_{DMPC} \wedge f_{SG}$ then
                Implement $u^i_{0,t}(\vartheta^i_t(k-1))$;
                $\vartheta^i_t(k) \leftarrow \vartheta^i_t(k-1)$;
            else
                Compute a $g^i(k) \in \partial q^i(x^i_t, \vartheta^{i-1}_t(k))$;
                $\vartheta^i_t(k) \leftarrow \mathcal{P}_\Theta[\vartheta^{i-1}_t(k) - \alpha(k) g^i(k)]$;
                $J^i_{curr} \leftarrow J^i(x^i_t, U^i_t, \vartheta^i_t(k))$;
                if $\sum_i (J^i_{curr} - J^i_{prev}) < 0$ then
                    Set $f_{DMPC}$ true;
                else
                    Set $f_{DMPC}$ false;
                end
            end
        end
        $\theta_t(k+1) \leftarrow \vartheta^N_t(k)$;
        if $\frac{1 + \ln(k+1)}{k+1} \frac{N^2 \beta^2}{4\mu^2} \le \epsilon_t$ then
            Set $f_{SG}$ true;
        else
            Set $f_{SG}$ false;
        end
        $k \leftarrow k + 1$;
    until new measurement is available
    $t \leftarrow t + 1$;
    $k \leftarrow 0$;
end loop